Intel® Xeon Phi™ Coprocessor
Datasheet
November 2012
Reference Number: 328209-001EN

Information in this document is provided in connection with Intel® products. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty relating to sale and/or use of Intel products, including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright, or other intellectual property right.

Unless otherwise agreed in writing by Intel, the Intel products are not designed nor intended for any application in which the failure of the Intel product could create a situation where personal injury or death may occur.

This document contains information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design with this information. Contact your local Intel sales office or your distributor to obtain the latest specification before placing your product order.

All products, dates, and figures are preliminary for planning purposes and are subject to change without notice.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725 or visiting Intel's website at http://www.intel.com. Confidential documents are only available via Intel Business Link at http://www.intel.com/ibl.

Intel, Xeon, Xeon Phi, and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2012, Intel Corporation. All rights reserved.

Table of Contents

1 Introduction
  1.1 Reference Documentation
  1.2 Conventions and Terminology
    1.2.1 Terminology
2 Intel® Xeon Phi™ Coprocessor Architecture
  2.1 Intel® Xeon Phi™ Coprocessor Board Overview
    2.1.1 Intel® Xeon Phi™ Coprocessor Board Design
    2.1.2 System Management Controller (SMC)
    2.1.3 Intel® Xeon Phi™ Coprocessor Silicon
    2.1.4 Intel® Xeon Phi™ Coprocessor Product Family
    2.1.5 Intel® Xeon Phi™ Coprocessor Dense Form Factor
3 Thermal and Mechanical Specification
  3.1 Mechanical Specifications
  3.2 Intel® Xeon Phi™ Coprocessor Thermal Specification
    3.2.1 Intel® Xeon Phi™ Coprocessor Thermal Management
  3.3 Intel® Xeon Phi™ Coprocessor Thermal Solutions
    3.3.1 3100 Series Active Cooling Solution
    3.3.2 SE10P/5110P/3100 Series Passive Cooling Solution
  3.4 Cooling Solution Guidelines for SE10X
    3.4.1 Thermal Considerations
    3.4.2 Mechanical Considerations
    3.4.3 Mechanical Shock and Vibration Testing
  3.5 Intel® Xeon Phi™ Coprocessor PCI Express* Card Extender Bracket Installation
    3.5.1 Step 0: Determine Lid Type
    3.5.2 Step 1: Overlap Lid Removal
    3.5.3 Step 2: OEM Bracket Installation
    3.5.4 Step 3: Replace Lid on "Overlap Lid" Units Only
4 Intel® Xeon Phi™ Coprocessor Pin Descriptions
  4.1 PCI Express* Signals
    4.1.1 PROCHOT_N (Pin B12)
  4.2 Supplemental Power Connector(s)
5 Power Specification and Management
  5.1 5110P SKU Power Options
  5.2 Intel® Xeon Phi™ Coprocessor Power States
  5.3 P-States and Turbo Mode
6 Manageability
  6.1 Intel® Xeon Phi™ Coprocessor Manageability Architecture
  6.2 System Management Controller (SMC)
  6.3 General SMC Features and Capabilities
    6.3.1 Catastrophic Shutdown Detection
  6.4 Host / In-Band Management Interface (MMIO)
  6.5 System and Power Management
  6.6 Out of Band / PCI Express* SMBus / IPMB Management Capabilities
    6.6.1 IPMB Protocol
    6.6.2 Polled Master-Only Protocol
    6.6.3 Supported IPMI Commands
  6.7 SMC LED_Error and Fan PWM

List of Figures

2-1 Intel® Xeon Phi™ Coprocessor Board Schematic
2-2 Intel® Xeon Phi™ Coprocessor Board Top Side
2-3 Intel® Xeon Phi™ Coprocessor Board, Back Side
2-4 Intel® Xeon Phi™ Coprocessor Silicon Layout
3-1 Location of Mounting Holes on the Intel® Xeon Phi™ Coprocessor (in mils)
3-2 Dimensions of Intel® Xeon Phi™ Coprocessor (Dimension in mils)
3-3 Entering and Exiting Thermal Throttling (PROCHOT)
3-4 Exploded View of 3100 Series Active Solution
3-5 Exploded View of Passive Thermal Solution
3-6 Airflow Requirement vs. Inlet Temperature for the 5110P/3100 Series Passive Cards
3-7 Airflow Requirement vs. Inlet Temperature for the SE10P Passive Card
3-8 Airflow Requirement vs. Inlet Temperature for the 5110P Card with 245W TDP
3-9 SE10X Power Profile for Coprocessor Intensive Workload (All Values in Watts)
3-10 SE10X Power Profile for Memory Intensive Workload (All Values in Watts)
3-11 SE10X SKU Coprocessor Junction Temperature (Tjunction) vs Power
3-12 SE10X SKU Coprocessor Case Temperature (Tcase) vs Power
3-13 SE10X Board Top Side
3-14 SE10X Board Bottom Side
3-15 Contents of Intel® Xeon Phi™ Coprocessor Package Shipment
3-16 Overlap Lid
3-17 Clearance Lid
3-18 Overlap Lid Removal
3-19 Tilt Overlap Lid and Slide as Shown to Disengage Tabs
3-20 OEM Bracket Installation
3-21 OEM Bracket Installation
3-22 Replace Lid on "Overlap Lid" Units
3-23 Replace Lid on "Overlap Lid" Units (cont.)
5-1 Coprocessor in C0-State and Memory in M0-State
5-2 Some Cores Are in C0-State and Other Cores in C1-State; Memory in M0-State
5-3 All Cores in C1 State; Memory in M1 State
5-4 All Cores in Package-C3 State; Memory in M1
5-5 Package-C3 and Memory M2 State
5-6 Package-C6 and Memory M2 State
5-7 Package-C6 and Memory M3 State
5-8 Intel® Xeon Phi™ Coprocessor P-States
6-1 Intel® Xeon Phi™ Coprocessor System Manageability Architecture
6-2 Schematic Representation of PROCHOT_N on the Intel® Xeon Phi™ Coprocessor
6-2 Example of Using Power Limits
6-3 Write Block Command Diagram
6-4 Read Block Command Diagram

List of Tables

1-1 Related Documents
1-2 General Terminology
2-1 Intel® Xeon Phi™ Coprocessor Product Family
3-1 Intel® Xeon Phi™ Coprocessor Mechanical Specification
3-2 Intel® Xeon Phi™ Coprocessor Thermal Specification
3-3 Component Thermal Specification on SE10X
3-4 Board Component Heights
3-5 Dynamic Load Shift Specification
4-1 PCI Express* Connector Signals on the Intel® Xeon Phi™ Coprocessor
5-1 Intel® Xeon Phi™ Coprocessor Power States
5-2 HPC Linpack Power Guidelines
6-1 SMBus Write Commands
6-2 Miscellaneous Command Details
6-3 FRU Related Command Details
6-4 SDR Related Command Details
6-5 SEL Related Command Details
6-6 Sensor Related Command Details
6-7 General Command Details
6-8 CPU Package Config Read Request Format
6-9 CPU Package Config Read Response Format
6-10 CPU Package Config Write Request Format
6-11 CPU Package Config Write Response Format
6-12 Set SM Signal Request Format
6-13 Set SM Signal Response Format
6-14 OEM Command Details
6-15 Set Fan PWM Adder Command Request Format
6-16 Set Fan PWM Adder Command Response Format
6-17 Get POST Register Request Format
6-18 Get POST Register Response Format
6-19 Assert Forced Throttle Request Format
6-20 Assert Forced Throttle Response Format
6-21 Enable External Throttle Request Format
6-22 Enable External Throttle Response Format
6-23 List of Sensor Names on the Intel® Xeon Phi™ Coprocessor
6-24 Status Sensor Report Format
6-1 LED Indicators

Revision History

Document Number   Revision Number   Description                  Date
328209-001        001               First release of datasheet   November 2012

1 Introduction

1.1 Reference Documentation

Table 1-1 lists most of the applicable documents. For a complete list of documentation, contact your local Intel representative or go to www.intel.com.

Table 1-1. Related Documents

Document                                                                        Document Number (www.intel.com)
PCI Express* Card Electromechanical Specification, Revision 2.0,
  April 11, 2007 (www.pcisig.com)                                               N/A
Intel® Xeon Phi™ Coprocessor Specification Update                               328205-001EN
Intel® Xeon Phi™ Coprocessor Safety Compliance Guide                            328206-001EN
Intel® Xeon Phi™ Coprocessor System Software Developer's Guide                  328207-001EN
Intel® Xeon Phi™ Coprocessor Thermal Mechanical Models                          328208-001EN
Intel® Xeon Phi™ Coprocessor Instruction Set Reference Manual
  (http://software.intel.com/en-us/forums/intel-many-integrated-core/)          N/A

1.2 Conventions and Terminology

1.2.1 Terminology

This section provides the definitions of some of the terms used in this document.

Table 1-2. General Terminology

Terminology   Definition
BGA           Ball Grid Array
BMC           Baseboard Management Controller
DFF           Dense Form Factor
ECC           Error Correction Code
GDDR          Graphics Double Data Rate
IBP           Intel Business Portal
IPMB          Intelligent Platform Management Bus
IPMI          Intelligent Platform Management Interface
ME            Manageability Engine
PCIe          PCI Express*
RAS           Reliability, Availability, Serviceability
SKU           Stock Keeping Unit
SMBus         System Management Bus
SMC           System Management Controller
TDP           Thermal Design Power
VR            Voltage Regulator

2 Intel® Xeon Phi™ Coprocessor Architecture

2.1 Intel® Xeon Phi™ Coprocessor Board Overview

The Intel® Xeon Phi™ coprocessor consists of the following primary subsystems:
• Many Integrated Core (MIC) coprocessor and GDDR5 memory.
• System Management Controller (SMC), on-board thermal sensors (inlet air, outlet air, coprocessor, and a single GDDR5 sensor), and fan (only on the 3100 series; see the SKU matrix in Table 2-1).
• Voltage regulators (VRs) powered by the motherboard through the PCI Express* connector, a 2x4 (150W) power connector, and a 2x3 (75W) power connector on the east edge of the card. Along with power through the PCI Express* connector, the 300W SKUs need both the 2x4 and 2x3 connectors to be driven by system power supplies. The 225W SKU may be powered through the PCI Express* connector and the 2x4 connector only.
• PCI Express* connections.
• The clock system is integrated in the coprocessor and requires only the PCI Express* 100 MHz reference clock and an on-board 100 MHz ±50 ppm reference.

Figure 2-1. Intel® Xeon Phi™ Coprocessor Board Schematic (1)
Notes:
1. The on-board fan is available only on the Intel® Xeon Phi™ coprocessor 3100 series active SKU.

The Intel® Xeon Phi™ coprocessor provides the following high-level features:
• A many-core coprocessor.
• Maximum 16-channel GDDR memory interface with an option to enable ECC.
• PCI Express* x16 lane Gen2 interface with optional SMBus management interface.
• Node power and thermal management, including power capping support.
• +12V power monitoring and on-board fan PID controller on the 3100 series active SKU.
• On-board flash device that loads the coprocessor OS on boot.
• Card-level RAS features and recovery capabilities.

2.1.1 Intel® Xeon Phi™ Coprocessor Board Design

The Intel® Xeon Phi™ coprocessor is a PCI Express* compliant, 246mm x 111mm, high-power add-in card. It supports a maximum of 16 GDDR memory channels, distributed on both sides of the PCB. Each memory channel supports two 16-bit wide GDDR devices (for a maximum of 32 devices on the card), which combine to give 32-bit wide data per channel.

Figure 2-2 and Figure 2-3 show the front and back sides of the PCB. The two notches along the top edge of the card are used to attach the cooling plate for the GDDR devices on the back side of the PCB (the side without the Intel® Xeon Phi™ coprocessor silicon). The VRs are split right and left to help reduce direct current resistance and current density.

The Intel® Xeon Phi™ coprocessor supports 2 power groups for a total of 4 primary low-voltage rails: group A contains VDDG, VDDQ, and VSFR, while group B contains VCCP. The VCCP, VDDG, and VDDQ rails are powered from the PCI Express* edge connector and the auxiliary 12V inputs. The VSFR rail is powered from the PCI Express* edge connector 3.3V input (~5W). VCCP is the coprocessor core voltage rail, while VDDQ, VDDG, and VSFR supply power to memory, portions of the coprocessor, and miscellaneous circuitry on the card.

Figure 2-2. Intel® Xeon Phi™ Coprocessor Board Top Side
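
The memory and power figures quoted in the board description can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the channel, device, and auxiliary connector values come from the text above, while the 75 W allowance for the PCI Express* edge connector is an assumption taken from the PCI Express* Card Electromechanical Specification and is not stated in this datasheet. All function and constant names are hypothetical.

```python
# Values quoted in the board description.
GDDR_CHANNELS = 16          # maximum number of GDDR memory channels
DEVICES_PER_CHANNEL = 2     # two GDDR devices per channel
DEVICE_WIDTH_BITS = 16      # each device is 16 bits wide
AUX_2X4_W = 150             # 2x4 auxiliary power connector
AUX_2X3_W = 75              # 2x3 auxiliary power connector

# ASSUMPTION: slot power limit for a x16 edge connector per the PCIe CEM
# specification; the datasheet text does not give this number.
PCIE_SLOT_W = 75


def memory_summary():
    """Return (device count, bits per channel, total interface width in bits)."""
    devices = GDDR_CHANNELS * DEVICES_PER_CHANNEL
    channel_width = DEVICES_PER_CHANNEL * DEVICE_WIDTH_BITS
    total_width = GDDR_CHANNELS * channel_width
    return devices, channel_width, total_width


def available_power_w(use_2x4, use_2x3):
    """Sum the power sources that are connected, in watts."""
    total = PCIE_SLOT_W
    if use_2x4:
        total += AUX_2X4_W
    if use_2x3:
        total += AUX_2X3_W
    return total


if __name__ == "__main__":
    print(memory_summary())                # devices, bits per channel, total bits
    print(available_power_w(True, True))   # 300W SKUs: slot + 2x4 + 2x3
    print(available_power_w(True, False))  # 225W SKU: slot + 2x4
```

Under that assumption, the connector combinations add up to the 300W and 225W card TDP classes named in the text.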

Figure 2-3. Intel® Xeon Phi™ Coprocessor Board, Back Side

Note: Figures 2-2 and 2-3 are representative of the final Intel® Xeon Phi™ coprocessor board without the package thermal and mechanical elements.

2.1.2 System Management Controller (SMC)

The SMC has three I2C interfaces. These allow a direct connection to the coprocessor I2C interface, an intracard I2C sensor bus, and a system SMBus for clean integration with platform and power management systems.

The interface between the SMC and the coprocessor is used to exchange coprocessor thermal and status information. The sensor bus allows board thermal, input power, and current sense monitoring for fan and power control. This information can be forwarded to the coprocessor for power state control. The SMBus is used by the system for chassis fan control with the passive heat sink card and for integration with the node management controller in the platform.

Communication with the system baseboard management controller (BMC) or platform controller hub (PCH) occurs over the SMBus using the standard IPMB protocol. See Chapter 6, "Manageability", for more details.

2.1.3 Intel® Xeon Phi™ Coprocessor Silicon

Figure 2-4 is a conceptual drawing of the general structure of the Intel® Xeon Phi™ coprocessor architecture, and does not imply actual distances, latencies, etc. The cores, PCIe interface logic, and GDDR5 memory controllers are connected via an interprocessor network (IPN) ring, which can be thought of as an independent bidirectional ring.

The L2 caches are shown here as slices per core, but can also be thought of as a fully coherent cache, with a total size equal to the sum of the slices. Information can be copied to each core that uses it to provide the fastest possible local access, or a single copy can be present for all cores to provide maximum cache capacity.

The Intel® Xeon Phi™ coprocessor can support up to 61 cores (making a 31 MB L2 cache) and 8 memory controllers with 2 GDDR5 channels each. The maximum number of cores and total card memory vary with the Intel® Xeon Phi™ coprocessor SKU; refer to the Intel® Xeon Phi™ PCI Express* Card Specification Update for information.

Communication around the ring follows a Shortest Distance Algorithm (SDA). Co-resident with each core structure is a portion of a distributed tag directory. These tags are hashed to distribute workloads across the enabled cores. Physical addresses are also hashed to distribute memory accesses across the memory controllers.

Figure 2-4. Intel® Xeon Phi™ Coprocessor Silicon Layout
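
To make the idea of shortest-distance routing on a bidirectional ring concrete, the following sketch picks the direction with fewer hops between two ring stops. It is a conceptual illustration only: the datasheet does not describe the actual SDA implementation, and the function name, stop numbering, and the eight-stop ring in the usage example are hypothetical. The constants restate the silicon figures quoted above.

```python
# Figures quoted in the silicon description.
MAX_CORES = 61
L2_TOTAL_MB = 31
MEMORY_CONTROLLERS = 8
CHANNELS_PER_CONTROLLER = 2

# Derived values: 16 GDDR5 channels in total, roughly half a megabyte of L2 per core.
MEMORY_CHANNELS = MEMORY_CONTROLLERS * CHANNELS_PER_CONTROLLER
L2_SLICE_MB = L2_TOTAL_MB / MAX_CORES


def ring_hops(src, dst, stops):
    """Return (hop count, direction) for the shorter way round a bidirectional ring.

    Ring stops are numbered 0..stops-1 (a hypothetical numbering). Ties are
    resolved in favour of the clockwise direction.
    """
    clockwise = (dst - src) % stops
    counterclockwise = (src - dst) % stops
    if clockwise <= counterclockwise:
        return clockwise, "clockwise"
    return counterclockwise, "counterclockwise"


if __name__ == "__main__":
    # Hypothetical 8-stop ring: stop 0 to stop 6 is shorter going the other way.
    print(ring_hops(0, 3, 8))
    print(ring_hops(0, 6, 8))
    print(MEMORY_CHANNELS, round(L2_SLICE_MB, 2))
```

The derived channel count agrees with the maximum of 16 GDDR memory channels given in the board design section.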

2.1.4 Intel® Xeon Phi™ Coprocessor Product Family

Table 2-1. Intel® Xeon Phi™ Coprocessor Product Family

SKU                 Card TDP (Watts)   Cooling Solution (1)
SE10P               300                Passive
SE10X               300                None (2)
5110P (3)           225                Passive
3100 series         300                Passive, Active
Dense form factor   TBD                None (4)

Notes:
1. The passive cooling solution uses a topside heatsink (vapor chamber and copper fins) and a backside aluminum plate. Active cooling uses an on-card dual-intake blower.
2. Same performance and card configuration as the SE10P but without the Intel heatsink or chassis retention mechanism; allows for custom thermal and mechanical design by users.
3. Refer to Section 5.1.
4. Dense form factor (DFF): smaller physical footprint than the other Intel® Xeon Phi™ coprocessor products, for innovative platform designs with a unique PCI Express* interface; PCI Express* Gen2 specification compliant.

2.1.5 Intel® Xeon Phi™ Coprocessor Dense Form Factor

The Intel® Xeon Phi™ coprocessor dense form factor (DFF) is a derivative of the standard Intel® Xeon Phi™ coprocessor PCI Express* form factor card. The high-level features of DFF are:
• 117.35mm (4.62") x 149.86mm (5.9") PCB.
• 230-pin unique edge finger designed to the industry standard x24 PCI Express* connector, PCI Express* Gen2 compliant.
• All power to the card is supplied through the connector.
• There is no auxiliary 2x4 or 2x3 power connector on the card.
• Supports vertical, straddle, or right-angle mating connectors.
• On-board SMC. The manageability features and software capabilities remain the same as for other Intel® Xeon Phi™ coprocessor products.
• To allow for system design innovation and differentiation, Intel will ship only the assembled and fully functional PCB, without heatsink or chassis retention mechanism. This allows system designers to implement their own cooling solution and connector of choice. Because GDDR5 memory components are present on the back side of the DFF board, a custom cooling design must comprehend both sides of the DFF product.
• Baseboard designers must ensure the signal integrity of all PCI Express* signals as they pass the connector of choice and reach the connector fingers of the DFF product.

3 Thermal and Mechanical Specification

3.1 Mechanical Specifications

The mechanical features of the Intel® Xeon Phi™ coprocessor are compliant with the PCI Express* 225W/300W High Power Card Electromechanical Specification 1.0. Table 3-1 shows the mechanical specifications of the Intel® Xeon Phi™ coprocessor passive and active cards.

Figure 3-1 shows the mounting holes and Figure 3-2 shows the relevant dimensions of the Intel® Xeon Phi™ coprocessor passive and active cards for chassis retention. Refer to the Intel® Xeon Phi™ PCI Express* Card Thermal and Mechanical Models for Pro-Engineering, Icepak, and FloTHERM models.

Table 3-1. Intel® Xeon Phi™ Coprocessor Mechanical Specification

Parameter                                   Specification
Card length                                 247.9mm (1)
Primary side height keep-in                 34.8mm
Secondary side height keep-in               2.67mm
Total card mass (3100 series active SKU)    1400g
Total card mass (all passive SKUs)          1200g

Notes:
1. Inclusive of I/O bracket.

Figure 3-1. Location of Mounting Holes on the Intel® Xeon Phi™ Coprocessor (in mils)

Figure 3-2. Dimensions of Intel® Xeon Phi™ Coprocessor (Dimension in mils)

3.2 Intel® Xeon Phi™ Coprocessor Thermal Specification

Table 3-2. Intel® Xeon Phi™ Coprocessor Thermal Specification

Parameter                           Specification
Assumed Tamb (chassis external)     35°C
Assumed Trise to card inlet         10°C
Assumed card Tinlet                 20°C
Max card Tinlet                     45°C
Max card Texhaust                   70°C
Tcase of coprocessor                92°C
Tcase of GDDR                       85°C
Tcontrol                            ~82°C (1)
Tthrottle                           ~104°C (2)
Tthermtrip                          ~Tthrottle + 20°C (3)

Notes:
1. Tcontrol is the setpoint at which the system fans must ramp up towards full power (or RPM) to maintain the Intel® Xeon Phi™ coprocessor temperature around Tcontrol and prevent throttling. It is highly recommended that the system BMC query the SMC on the coprocessor card for an accurate Tcontrol value.
2. When the coprocessor junction temperature (Tjunction) reaches Tthrottle due to insufficient cooling, the SMC will force thermal throttling, resulting in the lowest frequency of 800 MHz, in an attempt to reduce the power and cool down the coprocessor.
3. If the coprocessor temperature continues to rise beyond Tthrottle and approaches Tthermtrip, the card will shut down to prevent damage to the coprocessor. The VRs will be shut down and power to the card must be cycled. Tthermtrip should not be considered a specification; it can change between SKUs, and is given here as guidance.

3.2.1 Intel® Xeon Phi™ Coprocessor Thermal Management

Thermal management on the Intel® Xeon Phi™ coprocessor card is achieved through a combination of coprocessor-based sensors, card-level sensors and inputs, and a coprocessor frequency control circuit. Card temperature is reduced by adjusting the frequency of the coprocessor: lowering the coprocessor frequency reduces the power dissipation and consequently the temperature.

The coprocessor carries a factory-calibrated digital temperature sensor (DTS) that monitors coprocessor temperature. Data from this sensor is available to the BMC or other system software via both the in-band (direct software reads) and out-of-band (over the PCI Express* SMBus) interfaces. Refer to the chapter titled "Manageability" for more information on how to access DTS information. System management software can use this data to monitor the silicon temperature and take any appropriate actions. Systems that adjust airflow based on component temperatures must monitor the coprocessor's DTS to ensure sufficient cooling is always available.

In addition to making thermal information available to system manageability software, the DTS constantly compares the coprocessor temperature to the factory-set maximum permissible temperature, called Tthrottle. If the measured temperature at any time exceeds Tthrottle (a state also known as PROCHOT), the coprocessor will automatically step down the operating frequency (or P-state) in an attempt to reduce the temperature. This is often referred to as "thermal throttling". Once the temperature has dropped below Tthrottle, the frequency will be brought back up to the original setting. See Figure 3-3.

Within 50 ns of detecting Tthrottle, the DTS circuit begins stepping down the P-states until Pn is reached. Each frequency step is approximately 100 MHz; the exact value will depend on the starting frequency. After each step, the DTS will wait 10 µs before taking the next step. The number of steps, or P-states, depends on the starting frequency and the minimum frequency supported by the coprocessor. Once Pn is reached, the frequency will be held at that level for approximately 1 ms, or until the temperature has dropped below Tprochot, whichever is longer.

If throttling continues for more than 100 ms, the coprocessor OS will reduce the voltage setting in order to further decrease the power dissipation. The voltage settings are pre-programmed at the factory and cannot be reconfigured. Upon removal of the thermal event, the process reverses and the voltage and frequency are stepped back up to the P1 state. Although the process to reduce frequency is managed by the coprocessor circuits, the sequence to bring the coprocessor back to P1 is controlled by the coprocessor OS. As a result, the precise timings of the step changes on the way back up may be slightly longer than 10 µs.

3.3 Intel® Xeon Phi™ Coprocessor Thermal Solutions

There are two types of thermal solutions to address the Intel® Xeon Phi™ coprocessor power limits: a passive solution for most SKUs as indicated in Table 2-1 (which relies on forced convection airflow provided by the system) and an active solution on the 3100 series active SKU (which uses a high-performance blower). The active solution is designed to operate in an "adjacent card configuration" such that the impedance from a nearby flow blockage is accounted for within the design.

Both passive and active solutions come with cooling backplates. The backplates are required to augment the stiffness of the Intel® Xeon Phi™ coprocessor in order to counteract the preload applied by the primary side (housing the coprocessor) when assembled, to protect the structural integrity of the coprocessor and GDDR packages during a shock event, and to provide a protective cover. Given the requirement to dissipate backside GDDR heat within the 2.67mm keep-in height prescribed by the PCI Express* specification, the backplate is designed to transfer the GDDR heat from the secondary side via heat pipes to the primary side thermal solution.

Figure 3-3. Entering and Exiting Thermal Throttling (PROCHOT)
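
The thresholds in Table 3-2 and the throttling sequence described in section 3.2.1 can be sketched as a small model. This is a rough illustration under stated assumptions, not the behavior of the DTS circuit or SMC firmware: the threshold values are the approximate ones from Table 3-2 (the datasheet recommends querying the SMC for the actual Tcontrol), every step is taken as exactly 100 MHz, the 1100 MHz starting frequency in the usage example is hypothetical, and all names are invented for this example.

```python
# Approximate values from Table 3-2; real cards report their own Tcontrol.
T_CONTROL_C = 82.0
T_THROTTLE_C = 104.0
T_THERMTRIP_C = T_THROTTLE_C + 20.0

# Throttling parameters from section 3.2.1 and Table 3-2, note 2.
STEP_MHZ = 100        # "approximately 100 MHz" per step, treated as exact here
STEP_WAIT_US = 10     # wait after each step before taking the next one
MIN_FREQ_MHZ = 800    # lowest frequency reached under forced throttle


def thermal_zone(temp_c):
    """Classify a coprocessor temperature against the Table 3-2 thresholds."""
    if temp_c >= T_THERMTRIP_C:
        return "shutdown"      # VRs shut down, card power must be cycled
    if temp_c > T_THROTTLE_C:
        return "throttle"      # PROCHOT: frequency is stepped down
    if temp_c >= T_CONTROL_C:
        return "ramp_fans"     # system fans should ramp towards full power
    return "normal"


def throttle_ramp(start_mhz):
    """Return (frequencies visited, microseconds spent waiting between steps)."""
    freqs = []
    freq = start_mhz
    while freq > MIN_FREQ_MHZ:
        freq = max(freq - STEP_MHZ, MIN_FREQ_MHZ)
        freqs.append(freq)
    # One wait separates each pair of consecutive steps.
    wait_us = max(len(freqs) - 1, 0) * STEP_WAIT_US
    return freqs, wait_us


if __name__ == "__main__":
    for temp in (60, 90, 110, 130):
        print(temp, thermal_zone(temp))
    print(throttle_ramp(1100))  # hypothetical 1100 MHz starting frequency
```

Because each step waits only 10 µs, the descent to the minimum frequency completes in tens of microseconds, which is short compared with the roughly 1 ms hold at Pn and the 100 ms threshold after which the coprocessor OS lowers the voltage.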
thermal and mechanical specification 20 reference number: 328209-001EN 3.3.1 3100 series active cooling solution for the 3100 series active sku, the intel ? xeon phi? coprocessor thermal-mechanical solution utilizes a supersink approach in which a primary heatsink is used to cool the coprocessor while a metallic fuselage/supersink cools the vr and gddr components. figure 3-4 illustrates the key components of the active cooling design. in the fuselage/supersink approach, the duct is metallic and performs both structural and thermal roles. in its 'fuselage' function, the duct provides structural support for the forces generated by the coprocessor thermal interface, protects against shock events, and channels airflow through the card. in its 'supersink' function, the duct contains internal fins, heat pipes, and diecast blower frame. the internal heat pipes serve to transmit heat from gddr (both top- and bottom-side) and vr components to the internal fin banks, diecast blower frame, and metal fuselage structure where it can be effectively transferred to the airstream. the duct also contains horizontal webs which interface to the east and west gddr as well as to the vr fets. together, these structures dissipate heat lost from the gddr and vr components into the air. the coprocessor thermal path is separated from the gddr and vr components, and utilizes a heatsink with parallel plate fins and vapor chamber base. the active solution also contains a high-performance dual-intake blower that operates up to 5400 rpm at 20w of motor power. the blower has been designed to maximize the pressure drop capability and is able to deliver up to 35 ft 3 /min in an open airflow environment. when installed on the card, the blower delivers 31 ft 3 /min with no figure 3-4 exploded view of 3100 series active solution
reference number: 328209-001EN 21 thermal and mechanical specification adjacent blockage. when an adjacent card is considered, the resultant impedance loss causes the flow rate to drop to 23 ft 3 /min. the active thermal solution is designed to provide sufficient cooling even in the latter scenario. 3.3.2 se10p/5110p/3100 series passive cooling solution for the passive heatsink on the se10p/5110p/3100 series skus, the intel ? xeon phi? coprocessor thermal & mechanical solution also utilizes a 'fuselage/supersink' approach. figure 3-5 illustrates the key components of the passive design. as in the active thermal solution, the duct is metallic and performs both structural and thermal roles. in its 'fuselage' function, the duct provides structural support for the forces generated by the cpu thermal interface, protects against shock events, and channels airflow through the card. in its 'supersink' function, the duct contains internal fins and heat pipes. the internal heat pipes serve to transmit heat from gddr (both top- and bottom-side) and vr components to the internal fin banks, diecast blower frame, and metal fuselage structure where it can be effectively transferred to the airstream. the passive solution does not have a diecast blower frame as it relies upon forced airflow from the host system. in place of the blower and frame, an additional fin bank is added to dissipate waste heat from gddr and vr components. the fin spacing of all fin banks as well as of the cpu heat sink fin bank have been optimized for receiving system-supplied airflow. a backplate stiffener/heat sink is used. figure 3-5 exploded view of passive thermal solution
thermal and mechanical specification 22 reference number: 328209-001EN 3.3.2.1 system airflow for 5110p/3100 series passive skus in order to ensure adequate cooling of the 5110p/3100 series passive skus with a 45 o c inlet temperature, the system must be able to provide 20 ft 3 /min of airflow to the card with 4.3 ft 3 /min on the secondary side and the remainder on the primary side. the total pressure drop (assuming a multi-card installation conforming to the pci express* mechanical specification) is 0.21 inch h 2 o at this flow rate. note: for systems with reversed airflow, the corresponding airflow requirement is expected to be within +/-5% tolerance of the values shown in the following tables. if the system is able to provide a temperature lower than 45 o c at the card inlet, then the total airflow can be reduced according to the graph and table in figure 3-6 . if the 5110p sku is powered by a 2x4 and a 2x3 connector, the card can support an additional 20w of power for maximum tdp of 245w (see section 2.1.5 for more details). in this case, the corresponding airflow requirement for cooling the part as a 245w card is shown in figure 3-8 . 3.3.2.2 airflow requirement for se10p passive cooling solution in order to ensure adequate cooling of the se10p 300w sku with a 45 o c inlet temper- ature, the system must be able to provide 33 ft 3 /min of airflow to the card with 7.2 ft 3 / min on the secondary side and the remainder on the primary side. the total pressure drop (assuming a multi-card installation conforming to the pci express* mechanical specification) is 0.54 in h 2 o at this flow rate. if the system is able to provide a temperature lower than 45 o c at the card inlet, then the total airflow can be reduced according to the graph and table in figure 3-7 .
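the airflow figures in sections 3.3.2.1 and 3.3.2.2 give a total and a secondary-side flow, leaving the primary-side flow implicit. the sketch below simply works out that remainder and converts the values to si units. the dictionary keys and function name are illustrative, and the conversion factors are standard approximations rather than values from this datasheet.

```python
# Standard approximate conversion factors (not taken from the datasheet).
CFM_TO_M3_PER_H = 1.699011   # 1 ft^3/min expressed in m^3/h
IN_H2O_TO_PA = 249.089       # 1 inch of water expressed in pascals

# Requirements at a 45 degC inlet temperature, as quoted in the text.
REQUIREMENTS = {
    "5110p/3100 passive": {"total_cfm": 20.0, "secondary_cfm": 4.3, "dp_in_h2o": 0.21},
    "se10p passive": {"total_cfm": 33.0, "secondary_cfm": 7.2, "dp_in_h2o": 0.54},
}


def airflow_budget(sku):
    """Split the required airflow into primary and secondary sides, with SI equivalents."""
    req = REQUIREMENTS[sku]
    primary_cfm = req["total_cfm"] - req["secondary_cfm"]
    return {
        "primary_cfm": primary_cfm,
        "secondary_cfm": req["secondary_cfm"],
        "total_m3_per_h": req["total_cfm"] * CFM_TO_M3_PER_H,
        "pressure_drop_pa": req["dp_in_h2o"] * IN_H2O_TO_PA,
    }


if __name__ == "__main__":
    for name in REQUIREMENTS:
        print(name, airflow_budget(name))
```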
figure 3-6 airflow requirement vs. inlet temperature for the 5110p/3100 series passive cards
figure 3-7 airflow requirement vs. inlet temperature for the se10p passive card
figure 3-8 airflow requirement vs. inlet temperature for the 5110p card with 245w tdp
thermal and mechanical specification 26 reference number: 328209-001EN 3.4 cooling solution guidelines for se10x the intel? xeon phi? coprocessor se10x sku is shipped without a thermal solution, which gives system designers and integrators an opportunity to fit this sku into their custom designed chassis. this sku has gddr components on the back side that must be cooled, in addition to the front side where the coprocessor resides.this section documents thermal and mechanical specifications and guidelines that would be useful to developers of custom designs. 3.4.1 thermal considerations figure 3-9 and figure 3-10 show a schematic representation of the power profiles of the intel? xeon phi? coprocessor se10x product. figure 3-9 se10x power profile for coprocessor intensive workload (all values in watts)
table 3-3 shows thermal specifications of components present on the se10x.

figure 3-10 se10x power profile for memory intensive workload (all values in watts)

table 3-3. component thermal specification on se10x

component | thermal specification
coprocessor | t case 92 °C
gddr | t case 85 °C
vr fet | t junction 150 °C (see note 1)
vr inductor | t body 100 °C

notes:
1. while this is the component specification, on the passive and active intel® xeon phi™ coprocessor products, the junction temperature is limited to 135 °C in order to prevent damage to the pcb.
the simplest cooling mechanism would involve running fans at full speed. for custom air-cooled solutions that aim to economize on fan power and acoustics, figure 3-11 shows three regions on the se10x coprocessor power consumption curve that are relevant to system fan control.

region a-b on the line represents the minimum necessary performance of a cooling solution to keep the coprocessor silicon temperature (t junction) below the t throttle of 104 °C (table 3-2) during high power dissipation. in this region, a cooling solution based on airflow would run the fans at 100% capacity. in region b-c, the coprocessor power consumption is low enough that the cooling solution may be set to hold the junction temperature at a target temperature. finally, in region c-d, the coprocessor may need to be cooled to below the target temperature to maintain a reasonable exhaust air temperature. figure 3-12 shows the analogous thermal behavior of t case.

figure 3-11 se10x sku coprocessor junction temperature (t junction) vs power
for the region a-b, the cooling solution must maintain the case temperature below 95 °C, which in turn maintains the coprocessor silicon junction temperature below 104 °C. assuming an air-cooled heat sink, a maximum coprocessor power dissipation of 198 W (figure 3-9), and an inlet air temperature of 45 °C, the following relation between the coprocessor junction-to-case characteristic (ψjc) and the required case-to-air heat sink rating (ψca_req) can be used to determine the minimum necessary performance of a cooling system:

t junction = ψjc * cpu power + ψca_req * cpu power + t ambient

the heat sink must have a ψca_req value adequate to keep the coprocessor junction temperature at or below 104 °C. the value for ψjc is a characteristic of the intel® xeon phi™ coprocessor and may be treated as a constant, 0.047 °C/W.

as the coprocessor power level goes down (region b-c), it is desirable to keep the junction temperature at or below a target temperature, here shown at 82 °C. since each coprocessor is programmed at the factory with the actual control temperature (t control), a sophisticated cooling system may continuously read the junction temperature from the card smc and compare it to the programmed t control to adjust airflow. the change in airflow over an air-cooled heat sink affects the ψca value. it is common to reduce fan speed when maximum airflow is not needed, to save power, reduce noise, or both. in the b-c region, even though t junction is held at a constant value, t case actually rises slightly at lower power consumption levels. this is because a variable fan speed changes ψca while ψjc stays fixed: with t junction constant, the junction-to-case drop (ψjc * cpu power) shrinks as power falls, so t case moves closer to t junction.

finally, in the c-d region where the coprocessor consumes very little power, an air-cooled heat sink using a variable fan speed to maintain a target junction temperature may slow the airflow down too much. if the airflow is too low, the junction temperature may be maintained properly, but the exhaust air temperature approaches the junction temperature. data center design considerations, including safety, may dictate a maximum allowable exhaust air temperature, such as 70 °C, which in turn sets a maximum limit on ψca.

figure 3-12 se10x sku coprocessor case temperature (t case) vs power
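as a worked example of the region a-b sizing relation, the sketch below rearranges the junction temperature equation to solve for the required case-to-air rating, using the 198 W, 45 °C, and 104 °C figures from the text. the function names are illustrative, and treating the 0.047 junction-to-case constant as °C/W is an assumption based on the form of the equation.

```python
PSI_JC = 0.047  # junction-to-case constant from the text; units assumed to be degC/W


def required_psi_ca(t_junction_max, t_ambient, cpu_power_w, psi_jc=PSI_JC):
    """Largest case-to-air rating that still keeps t junction at or below its limit.

    Rearranged from: t_junction = psi_jc * P + psi_ca * P + t_ambient
    """
    return (t_junction_max - t_ambient) / cpu_power_w - psi_jc


def case_temperature(t_ambient, cpu_power_w, psi_ca):
    """Case temperature implied by a given case-to-air rating."""
    return t_ambient + psi_ca * cpu_power_w


if __name__ == "__main__":
    psi_ca = required_psi_ca(t_junction_max=104.0, t_ambient=45.0, cpu_power_w=198.0)
    print(round(psi_ca, 4))                                   # about 0.251
    print(round(case_temperature(45.0, 198.0, psi_ca), 2))    # about 94.69
```

with these inputs the required rating is about 0.251 and the implied case temperature is about 94.7 °C, which agrees with the 95 °C case temperature limit stated for region a-b.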
thermal and mechanical specification 30 reference number: 328209-001EN 3.4.2 mechanical considerations ? in the passive intel ? xeon phi? coprocessor products, the only component on the card with ihs load is the coprocessor. the compressive load is assumed to be approximately uniformly distributed over the ihs. the minimum load is 23lbf and maximum load is 75lbf. the mean pressure on the ihs is 33lbf. ? honeywell ptm3180 is recommended as the thermal interface material (tim). ? the gap filler used is the bergquist 3500s35. ? the intel passive heat sink is designed to nominal gaps of ? gddr: 0.3 +/- 0.1225 mm ? vr fets: 0.511 +/- 0.1225 mm ? vr inductors: 0.5 +/- 0.2 mm table 3_4 shows the maximum heights of the different components on the se10x product, along with the heights used in the product board design. figure 3-13 and figure 3-14 show the front and back sides of the se10xsku. refer to the intel ? xeon phi? coprocessor thermal and mechanical models document for the se10x sku. table 3_4. board component heights block color 1 component height (mils) min typ max coprocessor 171.221 177.992 184.763 gddr orange 47 47.25 vr inductor yellow 217 217 vr phase controller red 35 39.37 coprocessor vr controller green 37 37 gddr vr controller pink 35 35.43 capacitor topside purple 49 49 capacitor backside light blue 83 83 notes: 1. colors are in reference to figure 3-13 and figure 3-14 .
figure 3-13 se10x board top side
figure 3-14 se10x board bottom side
3.4.3 mechanical shock and vibration testing

table 3-5 shows the recommended shock and vibration guidelines, and dynamic load shift specifications.

3.5 intel® xeon phi™ coprocessor pci express* card extender bracket installation

intel® xeon phi™ coprocessors are shipped without the pci express* bracket (also known as extender bracket) installed on the card. the purpose of this bracket is to interface with the chassis mechanical card guides for standard full-length pci express* cards. in the shipped package, customers should expect to find:
- 1 intel® xeon phi™ coprocessor with assembled thermal solution.
- 1 intel® xeon phi™ coprocessor card extender bracket.
- 4 m3 x 6mm flat head screws.

note: the se10x sku is not shipped with the extender bracket.

table 3-5. dynamic load shift specification

test | specification and guidelines
board unpackaged shock | 50 g trapezoidal; v: 170 in/s; drops: 3x each on 6 faces
board unpackaged random vibration | 5 Hz @ 0.01 g²/Hz to 20 Hz @ 0.02 g²/Hz (slope up); 20 Hz to 500 Hz @ 0.02 g²/Hz (flat); input acceleration is 3.13 g rms; 10 min per axis in all 3 axes
system unpackaged shock | 25 g trapezoidal; velocity varies by system weight (20-39 lbs: 225 in/s; 40-79 lbs: 205 in/s); drops: 2x each on 6 faces
system unpackaged random vibration | 5 Hz @ 0.001 g²/Hz to 20 Hz @ 0.01 g²/Hz (slope up); 20 Hz to 500 Hz @ 0.01 g²/Hz (flat); input acceleration is 2.20 g rms; 10 min per axis in all 3 axes
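a random vibration profile and its overall input acceleration are linked: the g rms value is the square root of the area under the power spectral density curve. the sketch below integrates the board-level profile (0.01 g²/Hz at 5 Hz rising to 0.02 g²/Hz at 20 Hz, then flat to 500 Hz). it assumes the sloped segment is a straight line on log-log axes, which is the usual convention for such profiles but is not stated in this document, and the function names are illustrative.

```python
import math


def segment_area(f1, p1, f2, p2):
    """Area under one PSD segment, assumed to be a straight line on log-log axes."""
    if p1 == p2:
        return p1 * (f2 - f1)                      # flat segment
    n = math.log(p2 / p1) / math.log(f2 / f1)      # log-log slope exponent
    if abs(n + 1.0) < 1e-12:
        return p1 * f1 * math.log(f2 / f1)         # special case: PSD proportional to 1/f
    return p1 * f1 / (n + 1.0) * ((f2 / f1) ** (n + 1.0) - 1.0)


def grms(profile):
    """Overall g rms for a list of (frequency_hz, psd_g2_per_hz) breakpoints."""
    area = sum(
        segment_area(f1, p1, f2, p2)
        for (f1, p1), (f2, p2) in zip(profile, profile[1:])
    )
    return math.sqrt(area)


# Board-level unpackaged random vibration breakpoints.
BOARD_PROFILE = [(5.0, 0.01), (20.0, 0.02), (500.0, 0.02)]

if __name__ == "__main__":
    print(round(grms(BOARD_PROFILE), 2))
```

under that assumption the board profile integrates to roughly 3.1 g rms. the same function can be applied to any other list of breakpoints.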
3.5.1 step 0: determine lid type

if the lid type is "overlap", where the lid covers the top mounting holes as shown in figure 3-16, then go to step 1. if the lid type is "clearance", where the lid has a cut-out for the mounting holes as shown in figure 3-17, then go to step 2.

figure 3-15 contents of intel® xeon phi™ coprocessor package shipment
figure 3-16 overlap lid

3.5.2 step 1: overlap lid removal

a. remove 2 of the m3 x 6mm screws retaining the lid, as shown in figure 3-18.
b. remove the lid. take care not to bend the tabs, as shown in figure 3-19.

figure 3-17 clearance lid
figure 3-18 overlap lid removal

3.5.3 step 2: oem bracket installation

a. insert the oem bracket into the intel® xeon phi™ coprocessor card assembly, as shown in figure 3-20.
b. install (4) m3 x 6mm flat head screws; torque = 6 inch-lbs, as shown in figure 3-21.

figure 3-19 tilt overlap lid and slide as shown to disengage tabs
figure 3-20 oem bracket installation

at this point, "clearance lid" units are ready to be mounted in the chassis.

3.5.4 step 3: replace lid on "overlap lid" units only

a. insert the tabs into the slots in the card assembly, as shown in figure 3-22.
b. install the lid's screws (m3 x 6mm flat head); torque = 6 inch-lbs, as shown in figure 3-23.

figure 3-21 oem bracket installation
figure 3-22 replace lid on "overlap lid" units
figure 3-23 replace lid on "overlap lid" units (cont.)
4 intel® xeon phi™ coprocessor pin descriptions

4.1 pci express* signals

the pci express* connector for the intel® xeon phi™ coprocessor is a x16 interface and supports signals defined in the pci express card electromechanical specification. not all signals called out in the pci express* specification are utilized on the intel® xeon phi™ coprocessor; unused signals are listed as "not used" in table 4-1.

the symbol _n at the end of a signal name indicates that the active or asserted state occurs when the signal is at a low voltage level. when _n is not present after the signal name, the signal is asserted when at the high voltage level.

the following notations are used to describe the signal type:
i: signal is an input to the intel® xeon phi™ coprocessor
o: signal is an output from the intel® xeon phi™ coprocessor
i/o: bidirectional input/output signal
s: sense pin
p: power supply signal, sourced from the pci express* edge fingers or supplemental power connectors.

table 4-1. pci express* connector signals on the intel® xeon phi™ coprocessor

signal name | signal type | description
exp_a_tx_[15:0]_dp, exp_a_tx_[15:0]_dn | o | pci express* differential transmit pairs: 16-channel differential transmit pairs, referenced to the intel® xeon phi™ coprocessor. exp_a_tx_[15:0]_dp and exp_a_tx_[15:0]_dn are connected to the pci express* device transmit pairs on the intel® xeon phi™ coprocessor.
exp_a_rx_[15:0]_dp, exp_a_rx_[15:0]_dn | i | pci express* differential receive pairs: 16-channel differential receive pairs, referenced to the intel® xeon phi™ coprocessor. exp_a_rx_[15:0]_dp and exp_a_rx_[15:0]_dn are connected to the pci express* device receive pairs on the intel® xeon phi™ coprocessor.
ck_pe_100m_16port_dp, ck_pe_100m_16port_dn | i | pci express* reference clock: 100 MHz differential clock to the intel® xeon phi™ coprocessor, used by the coprocessor to properly recover data from the pci express* interface.
rst_pcie_n | i | pci express* reset signal: rst_pcie_n is a 3.3-volt active-low signal that, when deasserted (high), indicates that the +12v and vcc3 power supplies are stable and within their specified tolerance.
smb_pci_clk | i/o | pci express* system management bus clock: smb_pci_clk is the 3.3-volt clock signal for the smbus interface, which is normally used for power and/or thermal management and for monitoring the card.
smb_pci_dat | i/o | pci express* system management bus data: smb_pci_dat is the 3.3-volt data signal for the smbus interface, which is normally used for power and/or thermal management and for monitoring the card.
prsnt1_n, prsnt2_n | s | following the pci express* specification, prsnt1_n (pin a1) is connected on the coprocessor card to prsnt2_n (pin b81). the remaining prsnt2_n pins (b17, b31, b48) must be unconnected on the baseboard.
vcc3 | p | +3.3v supply: the positive 3.3-volt power supply to the pci express* card.
+12v | p | +12v supply: the positive 12-volt power supply to the pci express* card.
v_3p3_pciaux | p | +3.3vaux supply.
prochot_n (pin b12) | i | on the intel® xeon phi™ coprocessor, the smc supports an external path from the baseboard to the card's b12 pin, which allows system agents such as a bmc or me to throttle the card in response to card thermal events (thermal throttling). pin b12, defined as reserved in the pci express* specification, has been renamed prochot_n on the intel® xeon phi™ coprocessor and is driven by 3.3v power. this pin is held high (deasserted) by the card smc, and must be driven low by the baseboard to assert throttling. this feature is not available on the 3100 series active sku. see section 4.1.1 and chapter 6 for details.
wake_n | not used | pci express* wake signal.
exp_jtag[5:1] | not used | pci express* jtag interface.

4.1.1 prochot_n (pin b12)

system baseboard routing to the prochot_n pin must take into consideration the following details:
- the prochot_n pin is driven by the +3.3v power rail.
- the prochot_n pin is connected to a pull-down of 100k-ohm on the card.
- the input signal arriving at the pin from the baseboard must meet the following characteristics:
  - vih(min) = 2.7 V
  - vil(max) = 0.5 V
  - rise/fall times (max) = 240 ns
- the baseboard implementation can choose to be either push-pull or open-drain. in particular, an open-drain implementation must provide a pull-up resistor of 10k-ohm or less on the baseboard to counteract the pull-down on the card.

4.2 supplemental power connector(s)

the intel® xeon phi™ coprocessor draws at most 75 W from the pci express* connector, per the pci express* specification. the 2x4 and 2x3 supplemental power connectors on the coprocessor card provide the additional +12-volt power needed by the coprocessor. per the pci express* specifications, the 2x4 connector must be capable of a maximum 150 W power draw by the coprocessor, and the 2x3 connector must be capable of a maximum 75 W power draw.

the 300 W tdp products of the intel® xeon phi™ coprocessor family must have power supplied to both the 2x4 and the 2x3 connectors. the 225 W products can have either a single 2x4 connector connected to a power supply, or two 2x3 connectors (each capable of a maximum 75 W power draw).

within the coprocessor, the power rails from the three sources are not connected to each other. instead, the intel® xeon phi™ coprocessor is designed to draw power proportionally from the three power sources. during coprocessor power-up, sensors on the coprocessor card detect the presence of power supplies at the supplemental connectors and, depending on the maximum tdp of the coprocessor, can determine whether sufficient power is available to power up the card. for example, sensors on a 300 W coprocessor card must detect both 2x4 and 2x3 power supplies in order for the card to be powered up and function properly.
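the connector limits in section 4.2 amount to a simple power budget. the sketch below adds up the slot and supplemental cable limits quoted in the text and compares the total against a card's tdp. it is only the arithmetic behind the rule; the actual sensing logic on the card is not described in this document, and the names used here are illustrative.

```python
SLOT_LIMIT_W = 75                        # maximum draw from the pci express* slot
CABLE_LIMIT_W = {"2x4": 150, "2x3": 75}  # maximum draw per supplemental cable type


def available_power_w(cables):
    """Total power available from the slot plus the attached supplemental cables."""
    return SLOT_LIMIT_W + sum(CABLE_LIMIT_W[c] for c in cables)


def sufficient_for_tdp(tdp_w, cables):
    """True if the attached cables can cover the card's tdp."""
    return available_power_w(cables) >= tdp_w


if __name__ == "__main__":
    print(sufficient_for_tdp(300, ["2x4", "2x3"]))  # both cables attached
    print(sufficient_for_tdp(300, ["2x4"]))         # 2x4 only
    print(sufficient_for_tdp(225, ["2x4"]))         # 2x4 only
    print(sufficient_for_tdp(225, ["2x3", "2x3"]))  # two 2x3 cables
```

the results match the text: a 300 W card needs both cables, while a 225 W card can run from a single 2x4 cable or from two 2x3 cables.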
reference number: 328209-001EN 43 power specification and management 5 power specification and management power management in the intel ? xeon phi? coprocessor is primarily managed via the on-board resident coprocessor os with hardware-controlled functionality. table 5-1 shows estimates for coprocessor power states and respective memory power states, along with estimates of corresponding card power and wakeup times. as the intel ? xeon phi? coprocessor is being powered-on, it is expected to draw measurable amount of current from each of the power rails connected to the coprocessor card. below is the current and power drawn from each source during the power-on phase of the se10p sku: ? +12v 2x4 connector: 5.4a (~64w) a peak current of 7.1a for duration of 40 s ? +12v 2x3 connector: 3a (~36w) ? +12v pci express* slot pins: 1.6a (~20w) ? +3.3v pci express* slot pins: 1.3a (~5w) ? measured total power consumption: ~120w note: the above power-on measurements were taken with a single coprocessor card, using a specific open chassis system. it is not indicative of the coprocessor behavior in all types of systems in which the coprocessor will be used. the current and power values are meant to be guidelines for system power planning, and not specification of the intel ? xeon phi? coprocessor. 5.1 5110p sku power options most hpc applications running on the 5110p sku are expected to draw less than 225w, but the card is designed to support power surges above 225w. if the power surge goes above 236w for more than 300ms, then the smc on the card will instruct the intel? xeon phi? coprocessor to drop its operating frequency by approximately 100mhz, thus table 5-1. intel ? xeon phi? 
coprocessor power states coprocessor power state memory power state 1 se10p/se10x card power (watts) 5110p card power 2 (watts) wakeup time 3 c0 m0 300 225 n/a c1 m1/m2 <115 <105 <40 <1 s (<5ms m2) auto-pc3 m3 ~10 s (<5ms m3) deep-pc3 m3 ~10-40 s (<5ms m3) pc6 m3 not supported ~4-12 ms (<5ms m3) notes: 1. in many cases, memory power state latencies will govern the overall card wakeup time. 2. refer section 5.1 . 3. wakeup times are shown to provide orders of magnitude comparisons only, and may change based on part characterization.
power specification and management 44 reference number: 328209-001EN reducing power dissipation by approximately 10w. if power surge goes above 245w for more than 50ms, then the smc will assert the prochot_n signal to the coprocessor, which will cause the frequency to drop to the minimum possible value (refer to section 3.2.1 ). the level and duration of the power surge are programmable by the end user (refer chapter on manageability for more details). additionally, there may be applications such as hpc linpack that may draw up to 245w ( table 5-2 ). this should be taken into account when choosing one of the three modes of operation as listed below: ? users can install both the 2x4 and 2x3 power connectors for total available power of 300w. in this case, the card may draw up to 245w of power depending on the application. this mode ensures sufficient power is available and reduces the risk of throttling. users may see power dissipation approach 245w, as applications become more highly tuned to take advantage of the intel? xeon phi? coprocessor architecture. ? users can install either the 2x4 connector only or two 2x3 connectors for total available power of 225w. the card is designed to support power surges of up to 236w. if the power surge goes above 236w for more than 300ms, then the smc on the card will instruct the intel? xeon phi? coprocessor to drop its operating frequency by approximately 100mhz, thus reducing power dissipation by approximately 10w. ? if a greater card power limitation is desired, users can configure the smc to further limit the power draw of the 5110p sku, ensuring compatibility with less capable power delivery systems (refer to section 6.5 ). note: results may change with intel ? xeon phi? coprocessor steppings, frequencies, cluster configurations, changes in the card coprocessor os and workload binaries. ? all workloads provided by intel and run as a native offload application. ? test system setup: dual intel ? xeon ? 
e5-2680 cpus, 32gb ram; two passive intel ? xeon phi? coprocessor cards (61 core) per intel ? xeon ? processor ? all measurements taken with labview ? signal express @ 25c ambient temperature. ? hpc linpack was run for about 5 minutes to reach thermal saturation before measuring power. 5.2 intel ? xeon phi? coprocessor power states figure 5-1 to figure 5-8 are a schematic representation of the inter-relationship between the different coprocessor and memory power states on intel ? xeon phi? coprocessor. these schematic representations are only for illustrative purposes and do not represent all possible low power states. table 5-2. hpc linpack power guidelines hpc workload card sku card power on workload (w) linpack dp se10p/se10x 300 linpack dp 5110p 245
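the two-level power surge response described in section 5.1 (a frequency step-down above 236 W for more than 300 ms, and prochot_n assertion above 245 W for more than 50 ms) can be sketched as a check over a series of power samples. this is a simplified model, not the smc firmware: it assumes evenly spaced samples and treats "for more than" as an unbroken run of samples above the threshold, and the function name and action labels are hypothetical.

```python
# (threshold in W, duration in ms, action), strongest response first.
DEFAULT_LIMITS = (
    (245.0, 50.0, "assert_prochot"),
    (236.0, 300.0, "step_down_100mhz"),
)


def surge_response(samples_w, period_ms, limits=DEFAULT_LIMITS):
    """Return the strongest action triggered by the samples, or "none".

    samples_w: card power readings in watts, taken every period_ms milliseconds.
    """
    for threshold_w, duration_ms, action in limits:
        run_ms = 0.0
        for power in samples_w:
            # Extend the run while power stays above the threshold, else reset it.
            run_ms = run_ms + period_ms if power > threshold_w else 0.0
            if run_ms > duration_ms:
                return action
    return "none"


if __name__ == "__main__":
    print(surge_response([240.0] * 31, period_ms=10.0))  # 310 ms above 236 W
    print(surge_response([250.0] * 6, period_ms=10.0))   # 60 ms above 245 W
    print(surge_response([230.0] * 100, period_ms=10.0)) # never above 236 W
```

since the text notes that the level and duration of the surge limits are programmable, the limits are passed in as a parameter rather than fixed in the function.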
reference number: 328209-001EN 45 power specification and management in this power state, the card is expected to operate at its maximum tdp rating. note: no application is expected to dissipate maximum power from cores and memory simultaneously. coprocessor c1 state gates clocks on a core-by-core basis, reducing core power. on the active sku, the fan slows to an appropriate speed, reducing fan power. if all cores enter c1, the coprocessor automatically enters auto-pc3 state. figure 5-1. coprocessor in c0-state and memory in m0-state figure 5-2. some cores are in c0-state and other cores in c1-state; memory in m0-state
power specification and management 46 reference number: 328209-001EN if clock-enable input to memory is pulled high, then memory enters m1 state which reduces memory power. when all cores have entered c1 halt state, the coprocessor package can reduce the core voltage and enter deep-pc3. the fan (on active skus) can slow to minimum speed. vrs enter low power mode. figure 5-3. all cores in c1 state; memory in m1 state figure 5-4. all cores in package-c3 state; memory in m1
reference number: 328209-001EN 47 power specification and management from m1 state, memory can be put in self-refresh mode to enter the m2 state, further reducing memory power. the coprocessor os can request that the coprocessor enter package c6 state. core voltage is shut down. coprocessor power is <10w 1 in this state. figure 5-5. package-c3 and memory m2 state figure 5-6. package-c6 and memory m2 state 1. value may be revised following silicon characterization
power specification and management 48 reference number: 328209-001EN the memory clock can be fully stopped, reducing memory power to its minimum state. 5.3 p-states and turbo mode p-states, or performance states, are different frequency settings requested by the host os or application when the cores are in the c0 active/executing state. switching between p-states is done by the coprocessor when the os or application determines that more or less performance is needed. all active cores run at the same p-state frequency as there is only one clock source in the coprocessor. each frequency setting of the coprocessor requires a specific vid voltage setting in order to guarantee proper operation, and each p-state corresponds to one of these frequency and voltage pairs. each device is uniquely calibrated and programmed at the factory with its appropriate frequency and voltage pairs. as a result, it is possible that two devices with the same frequency specification may have different voltage settings. the highest p-state is p1, followed by sequentially lower frequency states of p2, p3?. with pn being the lowest frequency state. all parts within a given sku will have the same p-state settings, but p-state frequencies may vary across skus. figure 5-7. package-c6 and memory m3 state
figure 5-8. intel® xeon phi™ coprocessor p-states
reference number: 328209-001EN 51 6 manageability 6.1 intel ? xeon phi? coprocessor manageability architecture the server management and control panel component of the intel ? xeon phi? coprocessor architecture provides a system administrator with the runtime status of the intel ? xeon phi? coprocessor installed in a given system. there are two access methods by which the server management and control panel component may obtain status information from the intel ? xeon phi? coprocessor. the ?in-band? method utilizes the scif network and the capabilities designed into the ? os and the host driver to deliver the intel ? xeon phi? coprocessor status. it also provides a limited ability to set specific parameters that control hardware behavior. the same information can be obtained using the ?out-of-band? method. this method starts with the same capabilities in the ? os, but sends the information to the system management controller (smc) using a proprietary protocol. the smc can then respond to queries from the platform?s bmc using the ipmi protocol to pass the information upstream to the administrator or user. for more information on the tools available for management see the intel ? xeon phi? coprocessor system software developers guide . 6.2 system management controller (smc) intel ? xeon phi? coprocessor manageability relies on a system management controller (smc) on the card. the smc provides sensor telemetry information for management by in-band (host) software and out-of-band software via the pci express* smbus. the smc also provides additional functionality as described in this chapter. the smc is a microcontroller-based thermal management and communications system that provides card-level control and monitoring of the intel ? xeon phi? coprocessor. thermal management is achieved through monitoring the intel ? xeon phi? coprocessor and the various temperature sensors located on the coprocessor card. 
card-level power management monitors the card input power and communicates current power conditions to the intel ? xeon phi? coprocessor. smc features include: ? four thermal sensor inputs: inlet, outlet, coprocessor die, and gddr. ? power alert, thermal throttle, and thermtrip# signals. the smc connects to coprocessor silicon via the following i2c and out-of-band signals: ? in-band communication ? software access to thermal and power metrics via ganglia ? gmond exposed via standard ethernet port ? accessible via control panel gui and api ? out-of-band communication ? access to the smc via the pci express* smbus using the ipmi ipmb protocol ? 50ms sampling rate for power data
The manageability architecture also provides support for the Intel® Xeon Phi™ coprocessor in Node Manager mode, which adds functionality such as setting power limits and time windows.

In operational mode, the SMC monitors power and temperatures within the Intel® Xeon Phi™ coprocessor and through sensors located on the coprocessor card. This information is then used to control the power consumed by the PCI Express* card and, in the case of the 3100 series active SKU, the rotating speed of the fan on the card. The SMC provides status information (temperature, fan speed, and voltage levels) to the Intel® Xeon Phi™ coprocessor drivers, which can then be provided to the end user via a GUI.

The SMC provides a master/slave SMBus (using the IPMI IPMB protocol) so that a platform BMC or ME can control the SMC.

Figure 6-1. Intel® Xeon Phi™ Coprocessor System Manageability Architecture

The SMC on the Intel® Xeon Phi™ coprocessor has the following capabilities:
• General manageability features
  - Board ID and SKU definition
  - Unique identifying number
• Fan control
  - Read fan RPM
• Thermal throttling and throttle monitoring
  - Force throttling of the coprocessor
  - Monitor time in throttled state
  - Separate status for power-limited throttling vs. overtemperature throttling
• Card-level power limiting/capping
  - Power limit 0 and 1, tracked over separate time windows
  - P-state clamping if the P-state requested is not possible within the set power envelope
• Power/energy measurement
  - Can choose to include or preclude 3.3V power

6.3 General SMC Features and Capabilities

The Intel® Xeon Phi™ coprocessor supports the PCI Express* 2.0 standard. The SMC located on the card has direct access to information about the card operation (such as fan speeds and power usage) that must be managed from host-based software. The SMC supports manageability interfaces via MMIO and the preferred PCI Express* SMBus (IPMI IPMB protocol), as well as a polled master-only IPMI protocol. The SMC firmware update process is resilient against unexpected power loss and resets.

The SMC supports a read-only, IPMI-compliant FRU that contains the following information:
• Manufacturer name
• Product name
• Part number / model number
• UUID
• Manufacturer's IPMI ID
• Product IPMI ID
• Manufacturing time / date stamp
• Serial number (12 ASCII bytes)

To keep the Intel® Xeon Phi™ coprocessor within the operational temperature range, the SMC boosts the fan to full speed when either PERST or THERMTRIP_N is asserted on the 3100 series active SKUs with on-board fan. On SKUs with passive cooling solutions, the SMC samples a GPIO pin on startup to determine whether the closed-loop fan control algorithm and monitoring should be disabled.

Additionally, the SMC supports enabling and disabling an external assertion path from the baseboard to the card pin B12. This allows an external agent, such as a BMC or ME, to force throttle the Intel® Xeon Phi™ coprocessor during thermal events. Pin B12, defined as reserved in the PCI Express* specification, has been renamed PROCHOT_N on the Intel® Xeon Phi™ coprocessor and is driven by 3.3V power. This pin is held high (deasserted) by design, and must be driven low (asserted) by the baseboard to force throttling.
An OEM IPMB message from the baseboard to the SMC is required to enable the external throttling mechanism. See Section 4.1.1 for baseboard implementation details.

Figure 6-2. Schematic Representation of PROCHOT_N on the Intel® Xeon Phi™ Coprocessor

6.3.1 Catastrophic Shutdown Detection

Catastrophic shutdown is the act of the Intel® Xeon Phi™ coprocessor silicon shutting itself down to prevent damage to the device caused by overheating. The SMC monitors THERMTRIP_N to detect this event. When THERMTRIP_N is asserted (low), the SMC immediately forces the fan(s) to full speed and shuts down the VRs. Removal of power is required to reset the microcontroller to a known start point.

6.4 Host / In-Band Management Interface (MMIO)

Manageability, through the SMC, is achievable via the PCI MMIO interface. This allows host programs to obtain MIC telemetry and other information from the SMC-managed features of the Intel® Xeon Phi™ coprocessor itself, as well as to control SMC-enabled functions. The SMC supports a host-based MMIO interface.

The following SMC information and sensors are accessible over the MMIO-based interface:
• Hardware strapping pins
• SMC firmware revision number
• UUID
• PCI-compliant MMIO (required for PCI compliancy)
• Fan tachometer
• Fan PWM adder for boosting the fan speed for additional cooling
• SMC System Event Log (SEL)
• All registers mentioned in the Ganglia support section
• Voltage rail discrete monitoring
• All discrete temperature sensors
• Tcritical
• Tcontrol
• Tcurrent
• Tcontrol offset adder
• Thermal throttle duration due to card power limit (in ms); free-running counter that overflows at 60 seconds
• Tinlet (derived numbers)
• Toutlet (derived numbers)
• PERF_STATUS_THERMAL
• 32-bit POST register
• SMC SEL entry select and data registers (read only)
• SMC SDR entry select and data registers (read only; required to interpret the SEL)

Each SMC sensor that is exposed over MMIO indicates one of four states in a consistent manner, returned in the same register value as the sensor reading itself, regardless of sensor type. These states do not apply to non-sensor information:
• Normal
• Upper critical
• Lower critical
• Inaccessible (sensor not available)

This minimizes the complexity of host-driven software and SMC firmware implementations.

The sensors available from the SMC vary within the Intel® Xeon Phi™ coprocessor family of products. However, the IPMI SDR sensor names will not change from release to release. Tinlet and Toutlet are derived numbers based on the inlet and outlet temperature sensors.

The sensors located on the Intel® Xeon Phi™ coprocessor report the CPU temperature as well as the temperature at three locations on the Intel® Xeon Phi™ coprocessor. Currently, one sensor is located between memory chips near the PCI Express* slot while the other two are located on the east and west sides of the card. These are sometimes referred to as the "inlet" and "outlet" air temperature sensors, but they indicate the temperature of the board rather than the airflow temperature.

The power sensors are attached to the 12V inputs from the PCI Express* slot, the 2x3 connector, and the 2x4 connector. Input power can be estimated by summing the currents over these three connections. For an actively cooled card, the SMC can also provide the fan percentage pulse width modulation (PWM) being used.
Fan speed is governed by a simple PID control with setpoints set rather high to keep the sound level low when maximum cooling is not needed.

There are two programmable power limits. The first defaults to 105% of the design power consumption. If this limit is exceeded, the SMC notifies the Intel® Xeon Phi™ coprocessor silicon through an interrupt to reduce power. The second power limit forces throttling of the Intel® Xeon Phi™ coprocessor silicon when the power consumption is above 125% of design.
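The two default thresholds described above can be illustrated with a small sketch. This is not SMC firmware: the function name, the return labels, and the integer-watt inputs are assumptions made for illustration, and only the 105% and 125% defaults come from the text.

```python
def classify_power(power_w: int, design_w: int,
                   soft_pct: int = 105, hard_pct: int = 125) -> str:
    """Map a card power reading to the reaction described in the text.

    Hypothetical helper: the labels "interrupt" and "force_throttle" are
    illustrative names, not identifiers from the datasheet.
    """
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    if power_w * 100 > design_w * hard_pct:
        return "force_throttle"   # above 125% of design power
    if power_w * 100 > design_w * soft_pct:
        return "interrupt"        # above 105% of design power
    return "none"
```

With a hypothetical 200 W design power, a reading of 215 W would fall in the interrupt band and 260 W in the forced-throttle band.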

6.5 System and Power Management

The Intel® Xeon Phi™ coprocessor PCI card supports both on-card power management and an option for system-based management. With on-card power management, the SMC controls system power using preprogrammed power limits. The µOS can change the preprogrammed power limits if necessary. With system-based management, the SMC receives power control inputs via in-band communication from a host application.

If the server administrator knows that there is sufficient power in the server infrastructure to support greater than TDP for short time frames, then PL0 can be set to a value greater than the card TDP. If the server administrator knows that the power infrastructure cannot support power greater than PL1 indefinitely, then PL1 can be set to a value less than TDP. There is no relationship between PL0, PL1, and card TDP other than the fact that PL1 must be less than or equal to PL0.

PL0 sets the peak power limit for the card level. This is the moving average power (watts) that can be consumed in time window 0 (set in increments of milliseconds). PL1 sets the peak power limit for the card level. This is the moving average power (watts) that can be consumed in time window 1 (set in increments of milliseconds). PL0 and PL1 are set in increments of 1 W.

Typically, the time window for PL0 is set to less than the time window for PL1. This means that the SMC carries a running average for PL0 and PL1. The PL0 running average is calculated over time window 0 and the PL1 average is calculated over time window 1. The SMC collects power sensor data at an update rate of time window 0 (or faster), so PL0 is the last data collected from the power sensors. The PL1 average is a moving average over time window 1. A power excursion over PL1 for time window 1 may not cause the SMC to assert a power alert, assuming the PL1 time window is sufficiently long.
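The two running averages described above can be sketched as follows. This is a minimal model, not the SMC algorithm: the class and function names are hypothetical, the 50 ms and 300 ms defaults are borrowed from the worked scenario in this section, and a window that is not yet full is simply averaged over the samples seen so far (an assumption).

```python
from collections import deque


class PowerWindow:
    """Moving average of power samples over a fixed time window."""

    def __init__(self, window_ms: int, sample_ms: int):
        # Number of samples that fit in the window (at least one).
        self.samples = deque(maxlen=max(1, window_ms // sample_ms))

    def add(self, watts: float) -> float:
        """Record a sample and return the current moving average."""
        self.samples.append(watts)
        return sum(self.samples) / len(self.samples)


def evaluate_limits(readings, pl0, pl1,
                    window0_ms=50, window1_ms=300, sample_ms=50):
    """Return one (pl0_exceeded, pl1_exceeded) pair per power reading."""
    if pl1 > pl0:
        raise ValueError("PL1 must be less than or equal to PL0")
    w0 = PowerWindow(window0_ms, sample_ms)
    w1 = PowerWindow(window1_ms, sample_ms)
    events = []
    for watts in readings:
        avg0 = w0.add(watts)  # short window: tracks the latest sample(s)
        avg1 = w1.add(watts)  # long window: smooths out brief excursions
        events.append((avg0 > pl0, avg1 > pl1))
    return events
```

A single 260 W sample after five 200 W samples exceeds a PL0 of 250 W in the short window, while the long-window average (210 W) stays under a PL1 of 225 W, which mirrors the point that a brief excursion need not trip the PL1 limit.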

Figure 6-2 and the following discussion illustrate how to use the power limits.

Figure 6-2. Example of Using Power Limits

In this scenario PL0 is greater than the card TDP because the server operator knows that there is enough power in the server infrastructure to support greater than TDP for short time frames, even after reviewing the PL0 guardband. PL1 is less than TDP because the server operator knows that the power infrastructure cannot support greater than PL1 indefinitely.

The server operator set the time window for PL0 at 50 ms and the time window for PL1 at 300 ms. This means that the SMC carries a running average of PL0 and PL1. The PL0 running average is calculated over a 50 ms window and the PL1 average is calculated over a 300 ms window. The SMC collects power sensor data at an update rate of 50 ms, so PL0 is the last data collected from the power sensors. The PL1 average is a moving average over the last 300 ms. A 50 ms power excursion over PL1 might not cause the SMC to assert power alert if the PL1 time window is longer than 300 ms.

Figure 6-2 shows a possible 12V card power profile. In time window A the server started to run a high-power application that consumes more power than PL1. POWER_ALERT does not assert because the 12V power's 300 ms moving average is less than PL1, even though the 50 ms moving average is greater than PL1.

During time window B, the 50 ms moving average is greater than PL0. When there is a PL0 violation, the SMC will immediately assert thermal throttling (also known as a PROCHOT event) for 1 µs. The µOS will manage clearing the Intel® Xeon Phi™ coprocessor silicon and reducing the card operating frequency to the minimum value. The µOS has a routine that checks the source of the minimum frequency state and then takes actions to minimize future assertions of PL0.

During time C the µOS services an interrupt and sets the maximum performance state so that future PL0 violations are unlikely to occur. The µOS then restores the card to its rated operating frequency and the power starts to ramp.

During time D, the application does not ramp to PL0 because the µOS sets the maximum performance state to a lower state than in time A. However, the power draw is still greater than PL1. The SMC asserts POWER_ALERT near the end of time window D because the 12V power 300 ms moving average is greater than PL1. The µOS services the POWER_ALERT interrupt and takes appropriate action to lower the Intel® Xeon Phi™ coprocessor power consumption. Eventually the card power is under PL1.

6.6 Out of Band / PCI Express* SMBus / IPMB Management Capabilities

The Intel® Xeon Phi™ coprocessor PCI Express* card exists as part of a system-level ecosystem. In order for this system to manage its cooling and power demands, the Intel® Xeon Phi™ coprocessor telemetry must be exposed to ensure that the system is adequately cooled and that proper power is maintained. Manageability code running elsewhere in the chassis can, through the SMC, retrieve SMC sensor logs, sensor data, and vital information required for robust server management. Note that logging, in this context, is completely separate from and has nothing to do with the MCA error log.

The SMC public interface (SMBus) is a compliant IPMB interface. It supports a minimal IPMB command set in order to interact with manageability devices such as BMCs and the ME (Manageability Engine). The IPMB implementation on the SMC can receive additional incoming requests while responses are being processed. This enables the interleaving of requests and responses from multiple sources using the SMC's IPMB, thus minimizing latency.

Upon initial power-on or restart, the SMC selects an IPMB slave address from the range 0x30 - 0x4e in increments of 2 (e.g., 0x30, 0x32, 0x34). The IPMB slave address self-select starting address is nonvolatile, starting at the last selected slave address. This ensures that the card doesn't move nondeterministically in a static system.

To determine the address of the Intel® Xeon Phi™ coprocessor card, scan the range of addresses, issuing the Get Device ID command for each address. A valid response indicates that the address used is a valid address. For the Intel® Xeon Phi™ coprocessor cards, the IPMB slave address will be found at 0x30 if only a single card is installed.

If the motherboard has an exclusive connection to the SMBus on each PCI Express* connection, then the Intel® Xeon Phi™ coprocessor will assign itself a default address (0x30). If the SMBus connections are shared, the Intel® Xeon Phi™ coprocessors in a chassis will negotiate with each other and select addresses in the range from 0x30 to 0x4e. If a mux is incorporated into the design to isolate devices on a shared link, the address negotiation process should result in each card having address 0x30. However, if the mux in use allows the channels to be merged, i.e., creating a shared bus scenario, the address negotiation may result in each card having a unique address behind the mux.

Power management and power control are performed through the host driver interface (in-band). An SDK is provided as part of the Intel® Xeon Phi™ coprocessor software stack and is named micsyssw_oem.tar.

The SMC's PCI Express*/SMBus interface operates as an industry-standard IPMI IPMB with a reduced IPMI command implementation. The SMC supports a System Event Log (SEL) via the IPMI interface.

The SMC supports a read-only IPMI SDR. It is hardcoded and not end-user updateable. The SDR must be read in "chunks"; the suggested size is 16 bytes.
A request to read the entire buffer results in an error because the buffer size is insufficient to return the complete SDR.

6.6.1 IPMB Protocol

The IPMB protocol is a symmetrical byte-level transport for transferring IPMI messages between intelligent I2C devices. It is a worldwide standard widely used in the server management industry. In this case, the client requests are sent to the SMC with a master I2C write. Although both devices are a master on the bus at different times, the SMC only responds to requests. With the exception of the address selection algorithm, it does not initiate master transactions on the bus at any other time during normal operation. For byte-level details, refer to the Intelligent Platform Management Bus Communications Protocol Specification, v1.0.

6.6.2 Polled Master-Only Protocol

The polled master-only protocol may be used in the event IPMB is not feasible. The client sends requests to the SMC using one or more SMBus write block commands and then, at a later time, reads the response using one or more SMBus read block commands.
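The address scan described in Section 6.6 and the IPMB request framing referenced in Section 6.6.1 can be sketched as follows. This is a minimal illustration assuming the standard IPMB v1.0 request layout (two-byte header, header checksum, body, body checksum, each checksum being the two's complement of the byte sum). The function names and the requester address 0x20 (the conventional BMC address) are assumptions, and the bus transport itself is deliberately left out.

```python
NETFN_APP = 0x06          # App NetFn, per Table 6-2
CMD_GET_DEVICE_ID = 0x01  # Get Device ID, per Table 6-2


def checksum(data: bytes) -> int:
    """Two's-complement checksum: the bytes plus the checksum sum to 0 mod 256."""
    return (-sum(data)) & 0xFF


def build_ipmb_request(rs_sa: int, netfn: int, cmd: int, data: bytes = b"",
                       rq_sa: int = 0x20, rq_seq: int = 0,
                       rs_lun: int = 0, rq_lun: int = 0) -> bytes:
    """Build an IPMB request frame for the responder at address rs_sa."""
    header = bytes([rs_sa, (netfn << 2) | rs_lun])
    body = bytes([rq_sa, (rq_seq << 2) | rq_lun, cmd]) + data
    return header + bytes([checksum(header)]) + body + bytes([checksum(body)])


def candidate_addresses():
    """Addresses the SMC may self-select: 0x30 through 0x4e in steps of 2."""
    return list(range(0x30, 0x4E + 1, 2))


# One Get Device ID frame per candidate address; sending each frame and
# checking for a valid response is up to the (hypothetical) bus driver.
probe_frames = {addr: build_ipmb_request(addr, NETFN_APP, CMD_GET_DEVICE_ID)
                for addr in candidate_addresses()}
```

Building all frames up front keeps the sketch independent of any particular I2C library; a real scan would send each frame and treat a valid response as proof that a card owns that address.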

6.6.2.1 Polled Master-Only Protocol Clarifications

The polled master-only protocol is loosely based on the IPMI-defined SSIF protocol; however, a few changes have been made and ambiguities clarified in order to make the protocol more reliable:
• The I2C address for the polled master-only protocol and the IPMB protocol are the same and work together transparently.
• PEC bytes are required for all write commands and are returned with all valid read responses.
• The maximum SMBus data length is restricted to 32 bytes.
• The SMC ignores write commands that occur while it is internally processing a previous command.
• The SMC does not return valid data while busy internally processing a command.
• A sequence number has been added to help identify the condition where a new write command (using the same NetFn and command as the last command sent) was corrupted during transit. Without this precaution, two sequential requests of the same type (e.g., Get Sensor Reading) could result in one sensor's reading being mistaken for the other's.
• SMBAlert is not supported.
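The PEC requirement and the 32-byte limit listed above can be illustrated with a short sketch. It assumes the standard SMBus Packet Error Code, a CRC-8 with polynomial x^8 + x^2 + x + 1 and an initial value of zero, computed over the address byte, command, byte count, and data. The helper names and the use of the on-the-wire write address byte are assumptions; the datasheet does not spell out the byte coverage.

```python
def smbus_pec(data: bytes) -> int:
    """CRC-8 (polynomial 0x07, initial value 0) as used for SMBus PEC."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def write_block_with_pec(addr_write_byte: int, command: int,
                         payload: bytes) -> bytes:
    """Return the bytes that follow the address in an SMBus write block.

    addr_write_byte is the address byte as it appears on the wire for a
    write (an assumption of this sketch); it is covered by the PEC but is
    normally emitted by the bus controller, so it is not returned here.
    """
    if len(payload) > 32:
        raise ValueError("SMBus data length is restricted to 32 bytes")
    body = bytes([command, len(payload)]) + payload
    pec = smbus_pec(bytes([addr_write_byte]) + body)
    return body + bytes([pec])
```

A receiver can validate a transfer by running the same CRC over the address byte plus everything received, including the PEC; an intact transfer yields zero.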

6.6.2.2 SMBus Write and Read Block Command Numbers

Table 6-1. SMBus Write Commands

Command  Name                     Command Type
02h      Single part write        Write block
06h      Multi-part write start   Write block
07h      Multi-part write middle  Write block
08h      Multi-part write end     Write block
03h      Single read start        Read block
03h      Multi-part read start    Read block
09h      Multi-part read middle   Read block
09h      Multi-part read end      Read block

6.6.2.3 Write Description

Figure 6-3. Write Block Command Diagram

6.6.2.4 Read Description

Figure 6-4. Read Block Command Diagram

6.6.3 Supported IPMI Commands

The SMC supports a subset of the standard IPMI sensor, SEL, and SDR commands along with several Intel OEM commands for tasks such as forcing throttle mode. The supported IPMI commands are documented in the following sections. Standard IPMI details are not covered in this document; for those, refer to the IPMI v2.0 specification. For example, the Get SDR command requires additional bytes to complete the command packet, and these bytes are defined in the IPMI v2.0 specification.

6.6.3.1 Miscellaneous Commands

Table 6-2. Miscellaneous Command Details

NetFn       Command  Name
App (0x06)  0x01     Get Device ID
App (0x06)  0x08     Get Device GUID (UUID)

6.6.3.2 FRU Related Commands

Table 6-3. FRU Related Command Details

NetFn           Command  Name
Storage (0x0a)  0x10     Get FRU Inventory Area Info
Storage (0x0a)  0x11     Read FRU Data

6.6.3.3 SDR Related Commands

Table 6-4. SDR Related Command Details

NetFn           Command  Name
Storage (0x0a)  0x20     Get SDR Repository Info
Storage (0x0a)  0x21     Get SDR Repository Allocation Info
Storage (0x0a)  0x23     Get SDR

Note: It is recommended to read the SDR in "chunks" rather than request to read the entire record. See Section 6.6 for more information.

6.6.3.4 SEL Related Commands

Table 6-5. SEL Related Command Details

NetFn           Command  Name
Storage (0x0a)  0x40     Get SEL Info
Storage (0x0a)  0x41     Get SEL Allocation Info
Storage (0x0a)  0x43     Get SEL Entry
Storage (0x0a)  0x47     Clear SEL
Storage (0x0a)  0x48     Get SEL Time
Storage (0x0a)  0x49     Set SEL Time

6.6.3.5 Sensor Related Commands

Table 6-6. Sensor Related Command Details

NetFn          Command  Name
Sensor (0x04)  0x2b     Get Sensor Event Status
Sensor (0x04)  0x2d     Get Sensor Reading

6.6.3.6 General Commands

Table 6-7. General Command Details

NetFn                     Command  Name
Intel (0x2e)              0x42     CPU Package Config Read
Intel (0x2e)              0x43     CPU Package Config Write
Intel General App (0x30)  0x15     Set SM Signal

6.6.3.6.1 CPU Package Configuration Read

The CPU Package Config Read command reads power control data. For the parameter byte formats, refer to the Intel Xeon™ Processor Family External Design Specification (EDS) Volume 1.

Table 6-8. CPU Package Config Read Request Format

Byte #   Value  Description
Command  0x42   CPU Package Config Read
NetFn    0x2e   NETFN_INTEL
0-2             Manufacturer ID (LSB format): 0x57, 0x01, 0x00
3        0x00   CPU number
4        0x??   PCS index
                  3 - Accumulated energy status
                  11 - Socket power throttle duration
                  26 - Package power limit 1 (PL1)
                  27 - Package power limit 2 (PL0)
                  28 - Package power SKU A
                  29 - Package power SKU B
                  30 - Package power SKU unit
                  All other values reserved
5        0x00   Parameter LSB
6        0x00   Parameter MSB
7        0x??   Number of bytes to read

Table 6-9. CPU Package Config Read Response Format

Byte #  Value  Description
0       0x??   CompCode
                 0x00 - Normal
                 0xcc - Invalid field
                 0xa1 - Wrong CPU number
                 0xa7 - Wrong read length
                 0xab - Wrong command code
                 0xff - Unspecified error
1-3            Manufacturer ID (LSB format): 0x57, 0x01, 0x00
4[-7]   0x??   Data bytes read, up to 4 bytes

6.6.3.6.2 CPU Package Configuration Write

The CPU Package Config Write command allows the setting of power control data. For the parameter byte formats, refer to the Intel Xeon Processor Family External Design Specification (EDS) Volume 1.

Table 6-10. CPU Package Config Write Request Format

Byte #   Value  Description
Command  0x43   CPU Package Config Write
NetFn    0x2e   NETFN_INTEL
0-2             Manufacturer ID (LSB format): 0x57, 0x01, 0x00
3        0x00   CPU number
4        0x??   PCS index
                  26 - Package power limit 1 (PL1)
                  27 - Package power limit 2 (PL0)
                  All other values reserved
5        0x00   Parameter LSB
6        0x00   Parameter MSB
7        0x??   Number of bytes to write
8[-11]   0x??   Data bytes to write

Table 6-11. CPU Package Config Write Response Format

Byte #  Value  Description
0       0x??   CompCode
                 0x00 - Normal
                 0xc7 - Request length invalid
                 0xcc - Invalid field
                 0xa1 - Wrong CPU number
                 0xa6 - Wrong write length
                 0xab - Wrong command code
                 0xff - Unspecified error
1-3            Manufacturer ID (LSB format): 0x57, 0x01, 0x00
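The request and response layouts in Tables 6-8 and 6-9 can be sketched as follows. This is an illustrative encoding of the documented byte positions only: the function names are hypothetical, the lower bound of one byte on the read length is an assumption, and the returned data bytes are left uninterpreted because their format is defined in the EDS rather than here.

```python
INTEL_MFG_ID = bytes([0x57, 0x01, 0x00])  # manufacturer ID, LSB first

# PCS indexes listed as readable in Table 6-8.
READABLE_PCS = {3, 11, 26, 27, 28, 29, 30}

# Completion codes from Table 6-9.
READ_COMPCODES = {
    0x00: "normal",
    0xCC: "invalid field",
    0xA1: "wrong CPU number",
    0xA7: "wrong read length",
    0xAB: "wrong command code",
    0xFF: "unspecified error",
}


def build_pcs_read(pcs_index: int, read_len: int,
                   cpu: int = 0, param: int = 0) -> bytes:
    """Build the 8 data bytes of a CPU Package Config Read request."""
    if pcs_index not in READABLE_PCS:
        raise ValueError("reserved PCS index")
    if not 1 <= read_len <= 4:
        raise ValueError("read length must be 1 to 4 bytes")
    return INTEL_MFG_ID + bytes([cpu, pcs_index,
                                 param & 0xFF, (param >> 8) & 0xFF,
                                 read_len])


def parse_pcs_read_response(resp: bytes) -> bytes:
    """Check the completion code and manufacturer ID; return raw data bytes."""
    code = resp[0]
    if code != 0x00:
        raise RuntimeError(READ_COMPCODES.get(code, "unknown completion code"))
    if resp[1:4] != INTEL_MFG_ID:
        raise ValueError("unexpected manufacturer ID")
    return resp[4:8]
```

For example, `build_pcs_read(26, 4)` produces the data bytes for reading package power limit 1 (PL1); the NetFn (0x2e) and command (0x42) travel in the surrounding IPMI message rather than in these data bytes.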

6.6.3.6.3 Set SM Signal

The Set SM Signal command provides control of firmware signals. The primary use of this command is to set the status LED into identify mode. In identify mode the status LED flashes on for a short period twice every 2 seconds. This allows an administrator to locate the card in a system that has multiple cards.

Table 6-12. Set SM Signal Request Format

Byte #   Value  Description
Command  0x15   Set SM Signal
NetFn    0x30   NETFN_INTEL_GENERAL_APP
0        0x??   Signal
                  1 - Identify
                  All other values reserved
1        0x00   Instance
2        0x??   Action
                  If signal is 1:
                    1 - Assert: start the identify blink code
                    2 - Revert: return to normal operation
                  All other values reserved
[3]      0x00   Value (optional)

Table 6-13. Set SM Signal Response Format

Byte #  Value  Description
0       0x??   CompCode
                 0x00 - Normal
                 0xc7 - Request length invalid
                 0xc9 - Parameter out of range
                 0xcc - Invalid field

6.6.3.7 OEM Commands

Table 6-14. OEM Command Details

NetFn       Command  Name
OEM (0x3e)  0x00     OEM Set Fan PWM Adder
OEM (0x3e)  0x04     OEM Get POST Register
OEM (0x3e)  0x05     OEM Assert Forced Throttle
OEM (0x3e)  0x06     OEM Enable External Throttle

6.6.3.7.1 OEM Set Fan PWM Adder

The Set Fan PWM Adder command allows a PWM percentage to be added to the final fan cooling algorithm for additional cooling based on chassis requirements.

6.6.3.7.2 OEM Get POST Register

The Get POST Register command allows the BMC to obtain the last POST code written to the SMC by the coprocessor. The SMC does not modify this value in any way.

6.6.3.7.3 OEM Assert Forced Throttle

The Assert Forced Throttle command allows the BMC to cause the SMC to assert the PROCHOT pin to the coprocessor.

Table 6-15. Set Fan PWM Adder Command Request Format

Byte #   Value  Description
Command  0x00   OEM Set Fan PWM Adder
NetFn    0x3e   NETFN_OEM
0        0x??   PWM percent to add to standard cooling: 0x00 - 0x64
                All other values are reserved.

Table 6-16. Set Fan PWM Adder Command Response Format

Byte #  Value  Description
0       0x??   CompCode
                 0x00 - Normal
                 0xc9 - Parameter out of range

Table 6-17. Get POST Register Request Format

Byte #   Value  Description
Command  0x04   OEM Get POST Register
NetFn    0x3e   NETFN_OEM

Table 6-18. Get POST Register Response Format

Byte #  Value  Description
0       0x??   CompCode
                 0x00 - Normal
1-4     0x??   32-bit POST code in little-endian format
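The Set Fan PWM Adder request and the Get POST Register response in Tables 6-15 through 6-18 can be sketched as follows. The helper names and the (NetFn, command, data) tuple are illustrative conventions, not part of the datasheet; only the byte values and the little-endian POST code layout come from the tables.

```python
import struct

NETFN_OEM = 0x3E
CMD_SET_FAN_PWM_ADDER = 0x00
CMD_GET_POST_REGISTER = 0x04


def build_set_fan_pwm_adder(percent: int):
    """Return (netfn, command, data) for an OEM Set Fan PWM Adder request."""
    # Table 6-15 allows 0x00 through 0x64 (0 to 100 percent).
    if not 0x00 <= percent <= 0x64:
        raise ValueError("PWM adder must be between 0 and 100 percent")
    return (NETFN_OEM, CMD_SET_FAN_PWM_ADDER, bytes([percent]))


def parse_post_register(resp: bytes) -> int:
    """Extract the 32-bit POST code from a Get POST Register response."""
    if len(resp) < 5:
        raise ValueError("response too short")
    if resp[0] != 0x00:
        raise RuntimeError("completion code 0x%02x" % resp[0])
    # Bytes 1-4 hold the POST code, least significant byte first.
    return struct.unpack_from("<I", resp, 1)[0]
```

Validating the percentage on the client side mirrors the 0xc9 (parameter out of range) completion code that the SMC documents for this command.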

Table 6-19. Assert Forced Throttle Request Format

Byte #   Value  Description
Command  0x05   OEM Assert Forced Throttle
NetFn    0x3e   NETFN_OEM
0        0x??   0 - Deassert forced throttle
                1 - Assert forced throttle
                All other values are reserved

Table 6-20. Assert Forced Throttle Response Format

Byte #  Value  Description
0       0x??   CompCode
                 0x00 - Normal

6.6.3.7.4 OEM Enable External Throttle

The Enable External Throttle command causes the SMC to enable a pin on the baseboard connector, allowing the baseboard to directly assert the PROCHOT signal.

Table 6-21. Enable External Throttle Request Format

Byte     Value  Description
Command  0x06   OEM Enable External Throttle
NetFn    0x3e   NETFN_OEM
0        0x??   0 - Disable external throttle signal
                1 - Enable external throttle signal
                All other values are reserved

Table 6-22. Enable External Throttle Response Format

Byte  Value  Description
0     0x??   CompCode
               0x00 - Normal
               0xc0 - Busy

6.6.3.8 Other IPMI Related Information

The SMC supports a read-only IPMI FRU. The SMC System Event Log is a circular log supporting a minimum of 64 log entries. It is resilient to corruption, retaining information across unexpected power loss.

The sensor names in the IPMI Sensor Data Record are static and do not change from release to release. The IPMI sensor numbers may change and hence should be discovered during the normal sensor discovery process. Sensors may be added in the future, but the previously defined sensor names will not change.

Reading the SDR returns the sensors available on the card. There is a sensor number and a sensor name associated with each sensor. Once the sensor name is known, management firmware can read a particular sensor by discovering the sensor number associated with that name.

If a sensor identifier must be hard coded into the management firmware, it is strongly recommended to hard code the sensor name. Sensor numbers should not be hard coded because they are subject to change; a hard-coded sensor number could result in incorrect values with subsequent releases of the SMC firmware. Using the sensor name and discovering the sensor number associated with it ensures that the correct sensor is read and that valid data is returned with each future release of the SMC firmware. The following table lists the current sensor names.

Table 6-23. List of Sensor Names on the Intel® Xeon Phi™ Coprocessor

Sensor Name  Sensor Function

Power
power_pcie   Power measured at the PCI Express* edge finger input
power_2x3    Power measured at the 2x3 auxiliary connector input
power_2x4    Power measured at the 2x4 auxiliary connector input
power_pv     Power output reported by VR supplying power to coprocessor
power_vddq   Power output reported by VR supplying power to coprocessor
power_vddg   Power output reported by VR supplying power to memory and other circuitry
avg_power0   Average power consumption over limit time window 0
avg_power1   Average power consumption over limit time window 1
instpwr      Instantaneous power consumption reading
instpwrmax   Maximum instantaneous power consumption observed

Voltage
pv_volt      Voltage reported from VR supplying power to coprocessor
vddq_volt    Voltage reported from VR supplying power to coprocessor
vddg_volt    Voltage reported from VR supplying power to memory and other circuitry

Temperature
east_temp    Temperature sensor on the eastern-most side of the board
gddr_temp    Temperature sensor closest to the GDDR memory devices
west_temp    Temperature sensor on the western-most side of the board
pv_vrtemp    Temperature reported from VR supplying power to coprocessor
vddq_temp    Temperature reported from VR supplying power to coprocessor
vddg_temp    Temperature reported from VR supplying power to memory and other circuitry
proc_temp    Temperature reported by the coprocessor (junction temperature)
exhst_temp   Highest of discrete temperature sensors on the board
inlet_temp   Lowest of discrete temperature sensors on the board

Fan
fan_pwm      Fan PWM driven by SMC software (only on active SKU)
fan_tach     Fan tach read by SMC (only on active SKU)

Table 6-23. List of Sensor Names on the Intel® Xeon Phi™ Coprocessor (continued)

Sensor Name  Sensor Function

Other
status       Critical signal states (Section 6.6.3.9)
tcritical    Value reported by coprocessor for thermal monitoring
tcontrol     Value reported by coprocessor for system fan control

The SMC implements the ability to read all SMC-based sensors via the Get Sensor Reading command. The sensor number to be sent with the command must be discovered and not hard coded in the firmware, as hard coding can lead to incorrect readings or invalid sensor errors.

6.6.3.9 SMC IPMI Discrete Sensors

The SMC's IPMI discrete sensors are defined here because the meaning of each discrete bit cannot be easily derived from the SDR definition.

6.6.3.9.1 Sensor Status

The status sensor reports the state of several critical signals on the card such as THERMTRIP, VR phase fault, VR hot, UV/OV alert, and PCI Express* reset. The sensor is not mirrored as a register on the in-band register interface.

Table 6-24. Status Sensor Report Format

Bits  Name        Description
31:7  reserved    Reserved
6     p2e_rst     PCI Express* reset asserted. Fans boosted.
5     p12v_uvov   P12V under-voltage/over-voltage signal asserted. Fans boosted and VR output disabled.
4     vr2_hot     VR2 hot signal asserted. Fans boosted and PROCHOT asserted.
3     vr1_hot     VR1 hot signal asserted. Fans boosted and PROCHOT asserted.
2     vr2_phsflt  VR2 phase fault asserted. Fans boosted and VR output disabled. This state is latched until power-off.
1     vr1_phsflt  VR1 phase fault asserted. Fans boosted and VR output disabled. This state is latched until power-off.
0     thermtrip   Coprocessor THERMTRIP asserted. Fans boosted and VR output disabled. This state is latched until power-off.

6.7 SMC LED_ERROR and Fan PWM

The SMC firmware drives the LED_ERROR pin as follows:

Table 6-1. LED Indicators

Blink Frequency  Condition
0.5 Hz blink     In boot loader mode
2 Hz blink       Firmware update in progress
8 Hz blink       Operational code executing
Identify blink   2 short blinks every 2 seconds. Initiated by the SetSMSignal command.

The SMC drives the fan PWM to the static rate provided in the IPMI FRU while in boot loader mode.
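The status sensor bit assignments in Table 6-24 can be decoded with a short sketch like the one below. It assumes the 32-bit status value has already been assembled from the sensor reading; how the bits are packed into the Get Sensor Reading response is not described here, and the function name is hypothetical.

```python
from typing import List

# Bit positions and names from Table 6-24; bits 31:7 are reserved.
STATUS_BITS = {
    0: "thermtrip",
    1: "vr1_phsflt",
    2: "vr2_phsflt",
    3: "vr1_hot",
    4: "vr2_hot",
    5: "p12v_uvov",
    6: "p2e_rst",
}

# States that Table 6-24 describes as latched until power-off.
LATCHED = {"thermtrip", "vr1_phsflt", "vr2_phsflt"}


def decode_status(value: int) -> List[str]:
    """Return the names of all asserted status bits, lowest bit first."""
    return [name for bit, name in sorted(STATUS_BITS.items())
            if value & (1 << bit)]


def needs_power_cycle(value: int) -> bool:
    """True if any asserted condition stays latched until power-off."""
    return any(name in LATCHED for name in decode_status(value))
```

Decoding by name keeps management code readable, and the latched set makes it easy to flag conditions that only a power cycle will clear.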

